 alternative data


The Challenger: When Do New Data Sources Justify Switching Machine Learning Models?

Digalakis, Vassilis Jr, Pérignon, Christophe, Saurin, Sébastien, Sentenac, Flore

arXiv.org Machine Learning

We study the problem of deciding whether, and when, an organization should replace a trained incumbent model with a challenger relying on newly available features. We develop a unified economic and statistical framework that links learning-curve dynamics, data-acquisition and retraining costs, and discounting of future gains. First, we characterize the optimal switching time in stylized settings and derive closed-form expressions that quantify how horizon length, learning-curve curvature, and cost differentials shape the optimal decision. Second, we propose three practical algorithms: a one-shot baseline, a greedy sequential method, and a look-ahead sequential method. Using a real-world credit-scoring dataset with gradually arriving alternative data, we show that (i) optimal switching times vary systematically with cost parameters and learning-curve behavior, and (ii) the look-ahead sequential method outperforms other methods and is able to approach in value an oracle with full foresight. Finally, we establish finite-sample guarantees, including conditions under which the sequential look-ahead method achieves sublinear regret relative to that oracle. Our results provide an operational blueprint for economically sound model transitions as new data sources become available.
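The trade-off this abstract describes (a challenger that improves as data accumulates, a one-time switching cost, and discounted future gains) can be illustrated with a toy calculation. The sketch below is not the authors' framework or any of their three algorithms. It assumes a hypothetical power-law learning curve, a constant incumbent value per period, and full knowledge of all parameters, and it simply enumerates candidate switch times. Every name and default value (`challenger_value`, `v_max`, `b`, `c`, `switch_cost`, `gamma`) is an illustrative assumption.

```python
def challenger_value(s, v_max=1.0, b=0.6, c=0.5):
    """Hypothetical power-law learning curve: per-period value of a
    challenger trained on the data accumulated through period s."""
    return v_max - b * (s + 1) ** (-c)


def total_value(s, horizon, v_inc, switch_cost, gamma):
    """Discounted value of running the incumbent for periods 0..s-1,
    paying switch_cost at period s, then running the challenger.
    s == horizon means the organization never switches."""
    value = sum(gamma ** t * v_inc for t in range(min(s, horizon)))
    if s < horizon:
        value -= gamma ** s * switch_cost
        v_ch = challenger_value(s)
        value += sum(gamma ** t * v_ch for t in range(s, horizon))
    return value


def best_switch_time(horizon, v_inc, switch_cost, gamma):
    """Brute-force search over s = 0..horizon (assumes known parameters)."""
    return max(
        range(horizon + 1),
        key=lambda s: total_value(s, horizon, v_inc, switch_cost, gamma),
    )


if __name__ == "__main__":
    # Illustrative parameters only; not taken from the paper.
    s_star = best_switch_time(horizon=20, v_inc=0.5, switch_cost=1.0, gamma=0.95)
    print("best switch period:", s_star)
```

In this toy setting, raising `switch_cost` or `v_inc` pushes the best switch time later, eventually to "never", which matches the qualitative dependence on cost parameters mentioned in the abstract.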


Debiasing Alternative Data for Credit Underwriting Using Causal Inference

Lam, Chris

arXiv.org Artificial Intelligence

Alternative data provides valuable insights for lenders to evaluate a borrower's creditworthiness, which could help expand credit access to underserved groups and lower costs for borrowers. But some forms of alternative data have historically been excluded from credit underwriting because they could act as an illegal proxy for a protected class like race or gender, causing redlining. We propose a method for applying causal inference to a supervised machine learning model to debias alternative data so that it might be used for credit underwriting. We demonstrate how our algorithm can be used against a public credit dataset to improve model accuracy across different racial groups, while providing theoretically robust nondiscrimination guarantees.


Large Investment Model

Guo, Jian, Shum, Heung-Yeung

arXiv.org Artificial Intelligence

Traditional quantitative investment research is encountering diminishing returns alongside rising labor and time costs. To overcome these challenges, we introduce the Large Investment Model (LIM), a novel research paradigm designed to enhance both performance and efficiency at scale. LIM employs end-to-end learning and universal modeling to create an upstream foundation model capable of autonomously learning comprehensive signal patterns from diverse financial data spanning multiple exchanges, instruments, and frequencies. These "global patterns" are subsequently transferred to downstream strategy modeling, optimizing performance for specific tasks. We detail the system architecture design of LIM, address the technical challenges inherent in this approach, and outline potential directions for future research. The advantages of LIM are demonstrated through a series of numerical experiments on cross-instrument prediction for commodity futures trading, leveraging insights from stock markets.


Quant 4.0: Engineering Quantitative Investment with Automated, Explainable and Knowledge-driven Artificial Intelligence

Guo, Jian, Wang, Saizhuo, Ni, Lionel M., Shum, Heung-Yeung

arXiv.org Artificial Intelligence

Quantitative investment ("quant") is an interdisciplinary field combining financial engineering, computer science, mathematics, statistics, etc. Quant has become one of the mainstream investment methodologies over the past decades, and has experienced three generations: Quant 1.0, trading by mathematical modeling to discover mis-priced assets in markets; Quant 2.0, shifting quant research pipeline from small "strategy workshops" to large "alpha factories"; Quant 3.0, applying deep learning techniques to discover complex nonlinear pricing rules. Despite its advantage in prediction, deep learning relies on extremely large data volume and labor-intensive tuning of "black-box" neural network models. To address these limitations, in this paper, we introduce Quant 4.0 and provide an engineering perspective for next-generation quant. Quant 4.0 has three key differentiating components. First, automated AI changes quant pipeline from traditional hand-craft modeling to the state-of-the-art automated modeling, practicing the philosophy of "algorithm produces algorithm, model builds model, and eventually AI creates AI". Second, explainable AI develops new techniques to better understand and interpret investment decisions made by machine learning black-boxes, and explains complicated and hidden risk exposures. Third, knowledge-driven AI is a supplement to data-driven AI such as deep learning and it incorporates prior knowledge into modeling to improve investment decision, in particular for quantitative value investing. Moreover, we discuss how to build a system that practices the Quant 4.0 concept. Finally, we propose ten challenging research problems for quant technology, and discuss potential solutions, research directions, and future trends.


4 Ways Alternative Data Is Improving Fintech Companies in APAC - Fintech Hong Kong

#artificialintelligence

Various categories of fintech firms – Buy Now, Pay Later (BNPL), digital lending, payments and collections – are increasingly leveraging predictive models built using artificial intelligence and machine learning to support core business functions such as risk decisioning. According to a report by Grand View Research, Inc., the global AI in fintech market size is expected to reach US$41.16 billion by 2030, growing at a compound annual growth rate (CAGR) of 19.7% in Asia-Pacific alone from 2022 to 2030. The success of AI in fintech, or any business for that matter, hinges on an organisation's ability to make accurate predictions based on data. While internal data (first-party data) needs to be factored into AI models, this data often fails to capture critical predictive features, causing these models to underperform. In these situations, alternative data and feature enrichment can establish a powerful advantage.


Unleashing the power of machine learning models in banking through explainable artificial intelligence (XAI)

#artificialintelligence

The "black-box" conundrum is one of the biggest roadblocks preventing banks from executing their artificial intelligence (AI) strategies. It's easy to see why: Picture a large bank known for its technology prowess designing a new neural network model that predicts creditworthiness among the underserved community more accurately than any other algorithm in the marketplace. This model processes dozens of variables as inputs, including never-before-used alternative data. The developers are thrilled, senior management is happy that they can expand their services to the underserved market, and business executives believe they now have a competitive differentiator. But there is one pesky problem: The developers who built the model cannot explain how it arrives at the credit outcomes, let alone identify which factors had the biggest influence on them.


Fraud prevention is the biggest driver for investments in AI

#artificialintelligence

Provenir, a global leader in AI-powered risk decisioning software for the fintech industry, has found in its latest study that fraud prevention is the biggest driver for investments in AI-enabled risk decisions this year. The survey, which offers the views of 100 decision-makers from fintechs and financial services firms across Europe, found that other major drivers for investments in AI-enabled risk decisioning include automating decisions across the credit lifecycle (68%), competitive pricing (65%) and cost savings and operational efficiency (61%). The survey highlighted the role that alternative data can play in the fight against fraud, with 68% of those surveyed choosing to incorporate alternative data for the purpose of improving fraud detection. It also found that access to data is the biggest challenge to an organisation's risk strategy (88%), closely followed by a lack of a centralised view of data across the customer lifecycle (74%). "The risk of fraud has heightened across the entire financial services landscape, and with attacks only becoming more sophisticated and widespread, it is positive to see that more firms are turning to AI-enabled technologies to minimise these threats," said Carol Hamilton, SVP, Global Solutions at Provenir.


Artificial Intelligence in trading: The lowest-hanging fruit

#artificialintelligence

I recently spoke with an elderly gentleman -- a trader and fund owner. This conversation inspired me to write an article about Artificial Intelligence tools that are used in trading today. His fund employs over a dozen traders investing in various markets, and he is a veteran of oil trading. His trading style is conservative -- after finding a signal, he opens a trade holding a single position, sometimes for several weeks. I want to show you where and how you can use the most modern solutions in this example.


Artificial Intelligence as a Catalyst to Accelerate Financial Inclusion - Fintech Singapore

#artificialintelligence

The use of Artificial Intelligence (AI) in financial services is all over the news, with some reports estimating it to be a US$450 billion opportunity. But what's the real story around what AI can do? Beyond just automating certain processes, AI has the potential to improve accuracy in credit or risk decisioning workflows, encouraging financial inclusion and allowing the underbanked and unbanked access to financial services in ways that were previously unreachable. Over 3 billion people in Asia have no access to formal credit and three of the top ten 'most unbanked' countries in the world happen to be located in APAC (Vietnam, the Philippines and Indonesia). Finding innovative ways to enable more access to financial services is critical.


CFPB warnings of bias in AI could spook lenders

#artificialintelligence

Rohit Chopra has seized on nearly every public opportunity as director of the Consumer Financial Protection Bureau to admonish companies about the potential misuse of artificial intelligence in lending decisions. Chopra has said that algorithms can never "be free of bias" and may result in credit determinations that are unfair to consumers. He claims machine learning can be anti-competitive and could lead to "digital redlining" and "robo discrimination." The message for banks and fast-moving fintechs is loud and clear: Enforcement actions related to the use of AI are coming, as is potential guidance tied to what makes alternative data such as utility and rent payments risky when used in marketing, pricing and underwriting products, experts say. "The focus on artificial intelligence and machine learning is explicit," said Stephen Hayes, a partner at Relman Colfax PLLC and a former CFPB senior counsel.